Feature Bagging for Author Attribution

Authors

  • François-Marie Giraud
  • Thierry Artières
Abstract

The authorship attribution literature shows how difficult it is to design classifiers that outperform simple strategies, such as linear classifiers operating on a number of the most frequent lexical features, for example character trigrams. We claim that this stems, at least partially, from the difficulty of efficiently learning the contribution of every feature, which leads to either undertraining or overtraining of classifiers. To overcome this difficulty, we propose to use bagging techniques that learn classifiers on different random subsets of features and then combine their decisions by voting.
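
The idea in the abstract (train several classifiers on different random subsets of features, then let them vote) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the base learner, the number of models, or the subset size, so a nearest-centroid classifier is swapped in here for brevity, and every function name and parameter value below (`fit_feature_bagging`, `predict_vote`, `n_features`, `n_models`, `subset_size`) is a hypothetical choice made for illustration.

```python
import random
from collections import Counter


def char_trigrams(text):
    """Count overlapping character trigrams in a text."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def vectorize(text, features):
    """Relative frequency of each selected trigram in the text."""
    counts = char_trigrams(text)
    total = max(sum(counts.values()), 1)
    return [counts[f] / total for f in features]


def train_centroids(X, y, idx):
    """Per-author mean vector restricted to the feature indices in idx."""
    sums, sizes = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(idx))
        for j, i in enumerate(idx):
            acc[j] += x[i]
    return {c: [v / sizes[c] for v in acc] for c, acc in sums.items()}


def predict_centroid(centroids, x, idx):
    """Author whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((x[i] - m) ** 2 for i, m in zip(idx, centroids[c]))
    # Sorting the labels makes tie-breaking deterministic.
    return min(sorted(centroids), key=dist)


def fit_feature_bagging(texts, authors, n_features=100, n_models=15,
                        subset_size=30, seed=0):
    """Train n_models base classifiers, each on a random feature subset.

    Assumption: the candidate features are the n_features most frequent
    character trigrams of the training texts.
    """
    rng = random.Random(seed)
    pooled = Counter()
    for t in texts:
        pooled.update(char_trigrams(t))
    features = [g for g, _ in pooled.most_common(n_features)]
    X = [vectorize(t, features) for t in texts]
    size = min(subset_size, len(features))
    models = []
    for _ in range(n_models):
        # Each base classifier sees only its own random subset of features.
        idx = rng.sample(range(len(features)), size)
        models.append((idx, train_centroids(X, authors, idx)))
    return features, models


def predict_vote(ensemble, text):
    """Majority vote over the base classifiers' predictions."""
    features, models = ensemble
    x = vectorize(text, features)
    votes = Counter(predict_centroid(c, x, idx) for idx, c in models)
    return votes.most_common(1)[0][0]
```

Calling `fit_feature_bagging(texts, authors)` on a list of training texts with their author labels returns an ensemble that `predict_vote(ensemble, new_text)` can query. Because each base classifier only has to estimate weights for a small subset of features, no single model needs to learn the contribution of all features at once, which is the difficulty the abstract points to.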


Similar resources

Attributional Style in Patients with Comorbid Anxiety and Depression

  Objective: The present study investigates the attributional style of patients with comorbid anxiety and depression. Method: The subjects were 26 patients with major depression, 25 patients with generalized anxiety disorder, 17 patients with comorbid anxiety and depression, and 30 normal individuals. The instruments used for data collection were the Beck Depression Inventory, ...


Bagging and Feature Selection for Classification with Incomplete Data

Missing values are an unavoidable issue in many real-world datasets. Dealing with missing values is an essential requirement in classification problems, because inadequate treatment of missing values often leads to large classification errors. Some classifiers can work directly with incomplete data, but they often produce large classification errors and generate complex models. Feature selecti...


Classification of Brain Glioma by Using SVMs Bagging with Feature Selection

The degree of malignancy in brain glioma needs to be assessed by MRI findings and clinical data before operations. There have been previous attempts to solve this problem by using fuzzy max-min neural networks and support vector machines (SVMs), while in this paper, a novel algorithm named PRIFEB is proposed by combining bagging of SVMs with embedded feature selection for its individuals. PRIFE...


Rough Sets and Confidence Attribute Bagging for Chinese Architectural Document Categorization

To address two problems, namely that threshold filtering in traditional feature selection methods loses a lot of effective architectural information and that the weak classifiers in the Bagging algorithm all carry the same weight, and to improve the performance of Chinese architectural document categorization, a new algorithm based on Rough set and Confidence Attribute Bagging is propose...


A 3D Model Retrieval Algorithm Based on BP-Bagging

Aiming to solve the existing problems of 3D model retrieval based on neural networks, this paper proposes a new algorithm based on BP-bagging. Through bagging, the algorithm turns weak classifiers into a strong one. For feature extraction, the algorithm projects the 3D model into six 2D images from six perspective points. It then transforms the images into the frequency domain, gets the high dimension f...



Publication date: 2012